Introduction


In  2011,  2013  and  2016  the  National  Digital  Stewardship  Alliance  (NDSA)  conducted  surveys  of  U.S. 
organizations  currently  or  prospectively  engaged  in  web  archiving  to  better  understand  the  landscape: 
similarities  and  differences  in  programmatic  approaches,  types  of  content  being  archived,  tools  and 
services  being  used,  access  modes  being  provided,  and  emerging  best  practices  and  challenges. 

The  resulting  reports  are  available  here: 

2011  NDSA  web  archiving  survey 

2013  NDSA  web  archiving  survey 

2016  NDSA  web  archiving  survey 

The  NDSA  is  releasing  this  updated  survey  to  continue  to  track  the  evolution  of  web  archiving  programs  in 
the  United  States.  The  aggregate  responses  will  be  reported  to  NDSA  members  and  summary  results  will 
be  shared  publicly. 

Some questions refer to the Web Archiving Life Cycle model. As a reminder before the survey begins, the model is
explained here: https://archive-it.org/static/files/archiveit_life_cycle_model.pdf

Web  Archiving  Life  Cycle  model 
Please allow 15 minutes to complete the survey. Thank you for your participation!
About Your Organization


*  1.  Name  of  your  organization 


*  2.  Type  of  organization 

Archive 

Historical  Society 
College  or  University 
Museum 
Public  Library 
Commercial 
Consortium 
K12 School

Government:  Federal 
Government:  State 
Government:  Local 
Other  (please  specify) 


3. Do you or your organization belong to any of these groups? Choose all that apply.

Digital Library Federation (DLF) https://www.diglib.org/

International Internet Preservation Consortium (IIPC) http://netpreserve.org

National Digital Stewardship Alliance (NDSA) http://digitalpreservation.gov/ndsa

Society of American Archivists Web Archiving Section https://www2.archivists.org/groups/web-archiving-section
* 4. What is the status of your web archiving activity? Choose one.

Planning  /  considering  archiving  but  haven't  started  yet 
Pilot  /  testing 

Production  /  actively  capturing 

Have  collected  content  in  the  past  but  aren’t  currently  collecting 
Archiving Program Information - part 1


*  5.  How  does  the  state  of  your  organization's  web  archiving  program  compare  to  what  it  was  two 
years  ago?  Choose  one. 

Significant  progress 

Some  progress 

About  the  same 

Slightly  worse  off 

Much  worse  off 

6.  Other  or  comment  (use  this  box  to  provide  an  alternate  answer  or  commentary  on  your  answer  to 
question  5) 
As a quick reminder for the questions below, here is a graphic representation of the Web Archiving
Lifecycle  model: 


* 7. On what dimensions of the Web Archiving Lifecycle Model
(https://archive-it.org/static/files/archiveit_life_cycle_model.pdf) has your organization made the most progress?
Choose  three. 

Vision  and  Objectives 

Policy 

Resources and Workflow
Risk  Management 
Appraisal  and  Selection 
Scoping 

Data  Capture 

Quality Assurance and Analysis
Storage  and  Organization 
Preservation 

Metadata  /  Description 

Access / Use / Reuse
* 8. On what dimensions of the Web Archiving Lifecycle Model
(https://archive-it.org/static/files/archiveit_life_cycle_model.pdf) has your organization made the least progress?
Choose  three. 

Vision  and  Objectives 

Policy 

Resources  and  Workflow 
Risk  Management 
Appraisal  and  Selection 
Scoping 
Data  Capture 

Quality Assurance and Analysis
Storage  and  Organization 
Preservation 
Metadata  /  Description 
Access  /  Use  /  Reuse 

*  9.  What  are  the  goals  of  your  web  archiving  activity?  Choose  all  that  apply. 

Archive  your  own  or  affiliated  web  content  (e.g.,  university  archives  archiving  the  university  website  or  state  library 
archiving  state  agency  websites) 

Archive  content  from  other  organizations  or  individuals  (e.g.,  research  library  archiving  third-party  web  content  as  part 
of  topical  collection  building  or  special  collections  department  archiving  web  content  associated  with  a  manuscript 
donor) 

Other  (please  specify) 


10.  Optional:  briefly  share  any  driving  factors  which  determine  your  collecting  goals: 


11.  What  year  did  your  organization  begin  archiving  web  content? 
* 12. How much full-time equivalent (FTE) staff time does your organization dedicate to web
archiving?  Choose  one. 

For  example,  if  your  organization's  web  archiving  activity  relied  upon  one  dedicated,  full-time 
employee  and  50%  of  the  time  of  three  other  full-time  employees,  you  would  indicate  "more  than  1, 
less  than  3". 

0.25
0.5
0.75
1

more  than  1,  less  than  3 
3  or  more 

*  13.  What  are  the  top  considerations  for  the  development  of  your  web  archiving  program?  Choose 
three. 

Access  and  use  (e.g.,  researcher  interactions,  web  analytics,  use  cases) 

Cost  (e.g.,  budgeting,  service  allowance  utilization,  staffing  level  requirements) 

Data  volume  (e.g.,  data  volume  collected,  objects  collected,  acquisitions  statistics) 

Institutional  buy-in  (e.g.,  programmatic  growth,  stakeholder  testimonials,  resource  commitments) 

Loss  (e.g.,  link  and/or  reference  rot  of  archived  resources) 

Quality  (e.g.,  accuracy,  completeness,  replay  fidelity) 

Risk  management  (e.g.,  permission  responses,  takedown  requests,  policy  conformance) 

Other  (please  specify) 
* 14. What are the top staff skills that are essential to the development and success of web archiving
in  your  organization?  Choose  three. 

Appraisal  and  selection  (e.g.,  determining  what  web  content  to  collect) 

Archiving  tools  (e.g.,  configuring  or  operating  web  archiving  tools) 

Collaboration  and  communication  (e.g.,  advocacy,  coordination,  marketing,  or  outreach) 

Domain  expertise  (e.g.,  knowledge  of  subjects  that  are  the  focus  of  web  archiving) 

Metadata  (e.g.,  familiarity  with  metadata  standards,  cataloging  experience) 

Quality  assurance  (e.g.,  analyzing  and  troubleshooting  web  archive  quality  issues) 

Software  development  (e.g.,  able  to  develop  software  or  web  applications) 

Web  technologies  (e.g.,  familiarity  with  web  architecture,  design,  formats,  or  platforms) 

Other  (please  specify) 


*  15.  What  types  of  content  do  you  have  concerns  about  your  capacity  to  archive?  Choose  all  that 
apply. 

Audio 

Blogs 

Databases 

Interactive  media 

Social  media 

Video 

Other  (please  specify) 

*  16.  Do  you  currently  collaborate  with  any  other  institution  on  any  area  of  web  archiving? 

Yes 

No

No,  but  interested 
Don't  know 
17. If you answered yes to question 16, on which areas of web archiving do you collaborate with
another  institution?  Choose  all  that  apply. 

Best  practices  for  policy  and  risk  management 

Capture  configuration  and  optimization 

Collaborative  collection  development 

Input  on  APIs  and  standards 

Metadata  standards  and  application 

Quality  assurance  techniques  and  strategies 

Tool  development,  documentation,  or  user  feedback 

Other  (please  specify) 

*  18.  In  what  areas  of  web  archiving  are  you  most  interested  in  collaborating?  Choose  all  that  apply. 

Best  practices  for  policy  and  risk  management 

Capture  configuration  and  optimization 

Collaborative  collection  development 

Input  on  APIs  and  standards 

Metadata  standards  and  application 

Quality  assurance  techniques  and  strategies 

Tool  development,  documentation,  and  user  feedback 

Other  (please  specify) 

*  19.  What  barriers  do  you  face  for  collaborating  on  web  archiving?  Choose  all  that  apply. 

Still  in  planning  /  pilot  stage;  not  much  to  share 

Lack  of  institutional  support 

Lack  of  time  to  spend  on  collaborating 

Using  a  proprietary  system 

Institutional  policies  prohibit  or  limit  collaboration 

Other  (please  specify) 
Tools and Service Providers


20.  If  capturing  locally,  what  tool(s)  do  you  use?  Choose  all  that  apply. 

Adobe  Web  Capture 

Grab-a-Site 

Heritrix 

HTTrack 

Teleport  Pro 

Web  Archiving  Integration  Layer  (WAIL) 

Web  Curator  Tool 

WebRecorder 

Wget 

Other  (please  specify) 

*  21.  Are  you  capturing  social  media  utilizing  API  tools? 

Yes 

No

22.  If  you  answered  yes  to  question  21,  please  list  the  tools  you  use 
23. If you are (or have been) using an external service for data capture, which one(s) do you use?
Choose  all  that  apply. 

Archive-It

Hanzo  Archives 

Internet  Archive's  contract  crawling  services 
OCLC  Web  Harvester 
Other  (please  specify) 


24.  If  you  are  (or  have  been)  using  an  external  service  for  data  capture,  have  you  replicated  any  of 
your  data  to  another  repository? 

Yes 

No

25.  If  you  answered  yes  to  question  24,  to  what  kind  of  repository  have  you  replicated  the  data? 
Choose  all  that  apply. 

Local  repository 

External  preservation  service  provider 

26.  If  you  have  not  replicated  any  of  your  data  to  another  repository,  why  not?  Choose  all  that 
apply. 

Trust  web  archiving  service  provider 

Building  local  infrastructure 

No  place  to  store  /  maintain  it 

Not  sure  what  we’d  do  with  it  once  we  got  it 

Other  (please  specify) 
Access and Discovery
27. What kind(s) of access does your organization itself provide? Choose all that apply.

URL  search 
Full-text  search 
Browse list by URL
Browse  list  by  title 

Catalog  records:  collection-level  description 
Catalog  records:  item-level  description 
Finding  aids 

Application  programming  interfaces  (APIs) 

WARC  files 
Derivative  datasets 
Other  (please  specify) 


*  28.  Do  you  have  active  researchers  utilizing  your  web  archives? 

Yes 

No

I  don't  know 

29.  If  you  answered  yes  to  question  28,  could  you  provide  a  summary  of  how  researchers  are  using 
your  web  archives? 
30. Please indicate your typical approach regarding notifying and seeking permission from content
owners when capturing and providing access to their web content. Choose one for each row.


                              No action    Notifying    Requesting permission
Capturing                         O            O                 O
Providing restricted access       O            O                 O
Providing public access           O            O                 O

Other  (please  specify) 


*  31.  Have  you  ever  had  a  request  to  stop  collecting  or  to  take  down  content  that  you've  crawled  or 
made  accessible  without  explicit  permission? 

Yes 

No

If  yes,  please  specify 
Archiving Policies


32.  Please  indicate  your  typical  approach  regarding  notifying  and  seeking  permission  from  content 
owners  when  capturing  and  providing  access  to  their  web  content.  Choose  one  for  each  row. 


                              No action    Notifying    Requesting permission
Capturing                         O            O                 O
Providing restricted access       O            O                 O
Providing public access           O            O                 O

Other  (please  specify) 
* 33. Is any of the web content archived by your organization embargoed as a matter of policy before
being  made  accessible?  Choose  one. 

Yes 

No

Considering  whether  to  implement  embargo  and/or  what  length 
Not  applicable  (i.e.,  dark  archive) 

34.  If  you  selected  "Yes"  for  question  33,  how  long  is  your  organization's  access  embargo?  Choose 
one. 

Less  than  6  months 
6  months  up  to  1  year 
1-2  years 
2+  years 

Other  (please  specify) 


*  35.  Does  your  organization  have  archiving  policies  specifically  related  to  social  media? 

Yes 

No

36. If you answered yes to question 35, would you be willing to provide a link to, or a summary
of, your approach?


*  37.  Do  you  respect  robots.txt  when  capturing?  Choose  one. 

Always 

Never 

Sometimes  /  it  depends 
Don’t  know 
38. If you selected "Sometimes / it depends" for question 37, under what circumstances does your
organization  ignore  robots.txt?  Choose  all  that  apply. 


If  your  organization  owns  the  copyright/has  some  other  special  access  right  (e.g.,  state  archive  collecting  state  agency 
websites) 

If  permission  is  secured  or  appropriate  notices  have  been  sent 
In  order  to  capture  essential  content  (e.g.,  stylesheets,  images,  etc.) 

Other  (please  specify) 


39.  What  resources  has  your  organization  relied  upon  in  the  development  of  its  own  copyright  and 
access  policies?  Choose  all  that  apply. 

ARL Code of Best Practices in Fair Use for Academic and Research Libraries http://www.arl.org/focus-areas/copyright-ip/fair-use/code-of-best-practices

Oakland Archive Policy http://web.archive.org/web/20140812200246/http://www2.sims.berkeley.edu/research/conferences/aps/removal-policy.html

Section 108 Study Group Report http://www.section108.gov/

Consultation  with  legal  counsel 
Statutory authority

Web  archiving  policies  and  practices  of  other  organizations 
Previous  NDSA  Web  Archiving  Survey  reports 
Other  (please  specify) 

If you are willing to share your web archiving policies, please send a copy or provide a link to
ndsa@diglib.org.
Thank you for participating in the 2017 NDSA Web Archiving Survey!

Results  will  be  shared  in  2018. 